This is a draft PR of some work I did to measure compile times. I've had it sitting around for a while, so I decided to open it as a draft in case someone could use it.
The general idea is that Persistent suffers from slow compile times because its Template Haskell generates very large amounts of code. This causes issues for users in both development and production, especially because Persistent models are likely a fairly "root" dependency in many codebases.
There are a number of changes Persistent could potentially make to reduce compile times, for example deriving fewer instances for keys.
This PR takes the approach of having a sample project that primarily consists of a large .persistentmodels file, which is one of the files Mercury uses in production (with modifications). We benchmark compiling it in two ways:
- We use the bench CLI program, which is a wrapper around criterion. This gives us the usual benefits of criterion, like statistical measurement of our benchmarks. The downside is that it measures the full compilation time, not just the desired module.
- We compile the project with -ddump-timings and -ddump-to-file when benchmarking, then on each build copy the file containing the timings for our models module to another directory. At the conclusion of the benchmarking, we use those timing files to compute the average time it took to compile our models.
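For reference, here is a minimal Haskell sketch of the averaging step (the script this PR actually uses is add-timings.rb; this is not a transcription of it). It assumes each -ddump-timings line carries a trailing time=&lt;milliseconds&gt; field, which is how recent GHCs print phase timings, and that the results directory contains only the copied dump files:

```haskell
import Data.List (stripPrefix)
import Data.Maybe (listToMaybe, mapMaybe)
import System.Directory (listDirectory)
import System.Environment (getArgs)
import System.FilePath ((</>))
import Text.Read (readMaybe)

-- Extract the "time=<ms>" field from one -ddump-timings line, if present.
timeMs :: String -> Maybe Double
timeMs line = listToMaybe
  [ t
  | w <- words line
  , Just s <- [stripPrefix "time=" w]
  , Just t <- [readMaybe s]
  ]

main :: IO ()
main = do
  [dir] <- getArgs
  files <- map (dir </>) <$> listDirectory dir
  -- One dump file per build; its total is the sum of all phase timings.
  totals <- mapM (fmap (sum . mapMaybe timeMs . lines) . readFile) files
  putStrLn ("Mean is " ++ show (sum totals / fromIntegral (length totals)) ++ "ms")
```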
Overall I think this is a good approach to benchmarking compilation time. It can be used with a variety of compiler settings (e.g. -O0 matters for development, but -O1 or -O2 for production). But it could use more sample projects that exercise different parts of Persistent (e.g. perhaps there is a performance degradation for models with 20+ fields; the current PR would not catch that).
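For concreteness, the models module in such a sample project is essentially one big splice, so all of the measured time lands in a single -ddump-timings dump. A minimal sketch of that module, in the usual scaffolding style (the module name and file path here are illustrative, not the ones in this PR):

```haskell
{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE QuasiQuotes #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE UndecidableInstances #-}
module Models where

import Database.Persist.Quasi (lowerCaseSettings)
import Database.Persist.TH

-- The entire compile-time cost under measurement is this one splice:
-- mkPersist expands every entity in the models file into datatypes,
-- keys, and a large pile of instances.
share
  [mkPersist sqlSettings, mkMigrate "migrateAll"]
  $(persistFileWith lowerCaseSettings "config/models.persistentmodels")
```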
As a small example, removing just the Servant typeclasses from being derived for keys shaves several hundred milliseconds off compilation. This is fairly significant given that e.g. a Yesod project has no need for these instances, and the savings would be larger if applied to more models files.
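For context, the "Servant typeclasses" are the http-api-data classes. Hand-written for a hypothetical User entity, the per-key instances being dropped look roughly like this (a sketch of what the TH emits, not the generated code verbatim, assuming a SQL backend where keys round-trip through Int64):

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE TypeFamilies #-}

import Database.Persist.Sql (Key, fromSqlKey, toSqlKey)
import Web.HttpApiData (FromHttpApiData (..), ToHttpApiData (..))

-- "User" stands in for an entity defined via mkPersist elsewhere.
-- These instances only matter when keys appear in URLs, e.g. Servant
-- route captures; Yesod routes use PathPiece instead.
instance ToHttpApiData (Key User) where
  toUrlPiece = toUrlPiece . fromSqlKey

instance FromHttpApiData (Key User) where
  parseUrlPiece = fmap toSqlKey . parseUrlPiece
```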
```
~/D/C/H/y/p/p/compile-time-testing> ruby add-timings.rb projects/Mercury/results-no-servant2 20:04:26
/System/Library/Frameworks/Ruby.framework/Versions/2.6/usr/lib/ruby/2.6.0/universal-darwin19/rbconfig.rb:229: warning: Insecure world writable dir /usr/local/sbin in PATH, mode 040777
add-timings.rb:16: warning: assigned but unused variable - start
Looking for data in projects/Mercury/results-no-servant2
Mean is 9187.288562499998ms
~/D/C/H/y/p/p/compile-time-testing> ruby add-timings.rb projects/Mercury/results-no-servant 20:04:34
/System/Library/Frameworks/Ruby.framework/Versions/2.6/usr/lib/ruby/2.6.0/universal-darwin19/rbconfig.rb:229: warning: Insecure world writable dir /usr/local/sbin in PATH, mode 040777
add-timings.rb:16: warning: assigned but unused variable - start
Looking for data in projects/Mercury/results-no-servant
Mean is 9106.216874999998ms
~/D/C/H/y/p/p/compile-time-testing> ruby add-timings.rb projects/Mercury/results-baseline/ 20:04:46
/System/Library/Frameworks/Ruby.framework/Versions/2.6/usr/lib/ruby/2.6.0/universal-darwin19/rbconfig.rb:229: warning: Insecure world writable dir /usr/local/sbin in PATH, mode 040777
add-timings.rb:16: warning: assigned but unused variable - start
Looking for data in projects/Mercury/results-baseline/
Mean is 9485.424687499997ms
```
Changing the definition of persistFieldDef on every model to error "todo" decreased build time to 8669.350124999999ms on average (combined with the no-Servant-instances speedup). A good candidate for a real speedup might be to call into entityDef, then entityFields, and look the field up there, instead of embedding each field's definition. That might incur higher costs at runtime to look up fields, though.
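A sketch of that lookup-based alternative (entityDef, entityFields, and fieldHaskell are real persistent accessors; FieldNameHS is the field-name type in recent persistent releases, older ones call it HaskellName; UserId/UserName/UserAge belong to a hypothetical User model):

```haskell
-- Instead of splicing a full FieldDef literal for every constructor,
-- the TH could emit a tiny constructor-to-name table and share one
-- lookup over the already-generated entityDef.
persistFieldDef :: EntityField User typ -> FieldDef
persistFieldDef field =
  case filter ((== name) . fieldHaskell) (entityFields (entityDef proxy)) of
    (fd : _) -> fd
    []       -> error "persistFieldDef: field missing from entityDef (unreachable)"
  where
    proxy = Proxy :: Proxy User
    name = case field of
      UserId   -> FieldNameHS "Id"
      UserName -> FieldNameHS "name"
      UserAge  -> FieldNameHS "age"
```

The trade-off noted above is real: each call now scans the entity's field list at runtime instead of returning a preallocated FieldDef.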